Abstract

We give an $O(n^3 + n^2 t)$ time algorithm to determine whether an NFA with $n$ states
and $t$ transitions accepts a language of polynomial or exponential growth. We also
show that given a DFA accepting a language of polynomial growth, we can determine
the order of polynomial growth in quadratic time.

1 Introduction 

Let $L \subseteq \Sigma^*$ be a language. If there exists a polynomial $p(x)$ such that $|L \cap \Sigma^m| \le p(m)$ for
all $m \ge 0$, then $L$ has polynomial growth. Languages of polynomial growth are also called
sparse or poly-slender.

If there exists a real number $r > 1$ such that $|L \cap \Sigma^m| \ge r^m$ for infinitely many $m \ge 0$,
then $L$ has exponential growth. Languages of exponential growth are also called dense.

If there exist words $w_1, w_2, \ldots, w_k \in \Sigma^*$ such that $L \subseteq w_1^* w_2^* \cdots w_k^*$, then $L$ is called a
bounded language.
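
To make the two growth classes concrete, the following sketch counts $|L \cap \Sigma^m|$ for a DFA by dynamic programming over word lengths. The function name `count_words_by_length`, the dictionary encoding of the transition function, and the two example automata (for $a^*b^*$ and $(a+b)^*$) are illustrative assumptions, not part of the paper.

```python
def count_words_by_length(delta, start, finals, max_len):
    """Return [|L ∩ Σ^m| for m = 0..max_len] for a DFA.

    delta maps (state, letter) -> state; a missing key means "no transition".
    """
    # paths[q] = number of words of the current length that lead from start to q
    paths = {start: 1}
    counts = []
    for _ in range(max_len + 1):
        counts.append(sum(c for q, c in paths.items() if q in finals))
        nxt = {}
        for (q, _letter), r in delta.items():
            if q in paths:
                nxt[r] = nxt.get(r, 0) + paths[q]
        paths = nxt
    return counts


# a*b* is bounded (k = 2, w1 = a, w2 = b): m + 1 words of length m.
bounded = {(0, "a"): 0, (0, "b"): 1, (1, "b"): 1}
# (a+b)* contains every word: 2^m words of length m.
dense = {(0, "a"): 0, (0, "b"): 0}

print(count_words_by_length(bounded, 0, {0, 1}, 5))  # [1, 2, 3, 4, 5, 6]
print(count_words_by_length(dense, 0, {0}, 5))       # [1, 2, 4, 8, 16, 32]
```

The first sequence is bounded by the polynomial $p(m) = m + 1$, while the second equals $2^m$ for every $m$.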

Ginsburg and Spanier [6] (see Ginsburg [5, Chapter 5]) proved many deep results concerning
the structure of bounded context-free languages. One significant result [5, Theorem 5.5.2]
is that determining if a context-free grammar generates a bounded language is decidable.
However, although it is a relatively straightforward consequence of their work, they did not
make the following connection between the bounded context-free languages and those of
polynomial growth.

Theorem 1. A context-free language is bounded if and only if it has polynomial growth.

Curiously, this result has been independently discovered at least six times: namely, by 
Trofimov [17], Latteux and Thierrin [11], Ibarra and Ravikumar [8], Raz [15], Incitti [9],
and Bridson and Gilman [2]. A consequence of all of these proofs is that a context-free 
language has either polynomial or exponential growth; no intermediate growth is possible. 

The particular case of the bounded regular languages was also studied by Ginsburg and
Spanier [7], and subsequently by Szilard, Yu, Zhang, and Shallit [16] (see also [8]). It follows
from the more general decidability result of Ginsburg and Spanier that there is an algorithm
to determine whether a regular language has polynomial or exponential growth (see also
[16, Theorem 5]). Ibarra and Ravikumar [8] observed that the algorithm of Ginsburg and
Spanier runs in polynomial time for NFA's, but they gave no detailed analysis of the runtime.
Here we specialize the algorithm of Ginsburg and Spanier to the case of regular languages, 
and we give particular attention to the runtime of this algorithm. We also show how, given 
a DFA accepting a language of polynomial growth as input, one may determine the precise 
order of polynomial growth in polynomial time. 



2 Polynomial vs. exponential growth 

In this section we give an $O(n^3 + n^2 t)$ time algorithm to determine whether an NFA with $n$
states and $t$ transitions accepts a language of polynomial or exponential growth.

Theorem 2. Given an NFA $M$, it is possible to test whether $L(M)$ is of polynomial or
exponential growth in $O(n^3 + n^2 t)$ time, where $n$ and $t$ are the number of states and transitions
of $M$ respectively.

Let $M = (Q, \Sigma, \delta, q_0, F)$ be an NFA. We assume that every state of $M$ is both accessible
and co-accessible, i.e., every state of $M$ can be reached from $q_0$ and can reach a final state.
For each state $q \in Q$, we define a new NFA $M_q = (Q, \Sigma, \delta, q, \{q\})$ and write $L_q = L(M_q)$.

Following Ginsburg and Spanier, we say that a language $L \subseteq \Sigma^*$ is commutative if there
exists $u \in \Sigma^*$ such that $L \subseteq u^*$.

The following two lemmas have been obtained in more generality in all of the previously
mentioned proofs of Theorem 1 (compare also Lemmas 5.5.5 and 5.5.6 of Ginsburg [5], or in
the case of regular languages specified by DFA's, Lemmas 2 and 3 of Szilard et al. [16]).

Lemma 3. If $L(M)$ has polynomial growth, then for every $q \in Q$, $L_q$ is commutative.

Proof. A classical result of Lyndon and Schützenberger [12] implies that if a set of words
$X$ does not satisfy $X \subseteq u^*$ for any word $u$, then there exist $x, y \in X$ such that $xy \ne yx$.
Suppose then that $L(M)$ has polynomial growth, but for some $L_q$ there exist $x, y \in L_q$ with
$xy \ne yx$. Let $v$ be any word such that $q \in \delta(q_0, v)$, and let $v'$ be any word such that
$\delta(q, v') \cap F \ne \emptyset$. Then for every $m \ge 0$, the set $v(xy + yx)^m v'$ consists of $2^m$ distinct words
of length $|vv'| + m|xy|$ in $L(M)$. It follows that $L(M)$ has exponential growth, contrary to
our assumption. □
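
The counting step in this proof can be checked on a small instance. The sketch below enumerates $v(xy + yx)^m v'$ for one hypothetical choice of non-commuting words ($x = a$, $y = ab$, with $v = c$ and $v' = d$); the helper name `pumped_words` is an assumption made for illustration.

```python
from itertools import product


def pumped_words(v, x, y, v_prime, m):
    """Return the set of all words in v (xy + yx)^m v'."""
    blocks = (x + y, y + x)
    return {v + "".join(choice) + v_prime for choice in product(blocks, repeat=m)}


x, y = "a", "ab"                  # xy = "aab" differs from yx = "aba"
words = pumped_words("c", x, y, "d", 4)
print(len(words))                 # 16
print({len(w) for w in words})    # {14}
```

Since $xy$ and $yx$ are distinct words of the same length, different choices of blocks give different words, so the set has $2^4 = 16$ elements, each of length $|vv'| + 4|xy| = 14$.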

Lemma 4. If for every $q \in Q$, $L_q$ is commutative, then $L(M)$ has polynomial growth.



Proof. We prove by induction on the number $n$ of states of $M$ that the hypothesis of the
lemma implies that $L(M)$ is bounded. It is well-known that any bounded language has
polynomial growth (see, for example, [16, Proposition 1]). Clearly the result holds for $n = 1$.
We suppose then that $n > 1$.

Let $Q' = Q \setminus \{q_0\}$, $F' = F \setminus \{q_0\}$, and $\delta'(q, a) = \delta(q, a) \setminus \{q_0\}$ for all $q \in Q'$ and $a \in \Sigma$. For
each $q \in Q'$, we define an NFA $N_q = (Q', \Sigma, \delta', q, F')$, and we write $A_q = L(N_q)$. Applying
the induction hypothesis to $N_q$, we conclude that $A_q$ is bounded.

The key observation is that $L(M) = L_1 \cup L_2$, where

$$L_1 = \bigcup_{a \in \Sigma} \; \bigcup_{q \in \delta(q_0, a) \setminus \{q_0\}} L_{q_0} \, a \, A_q$$

and

$$L_2 = \begin{cases} L_{q_0}, & \text{if } q_0 \in F; \\ \emptyset, & \text{otherwise.} \end{cases}$$

By assumption, $L_{q_0} \subseteq u^*$ for some $u \in \Sigma^*$, and, as previously noted, by the induction
hypothesis each of the languages $A_q$ is bounded. It follows that $L(M)$ is a finite union of
bounded languages, and hence is itself bounded. We conclude that $L(M)$ has polynomial
growth, as required. □

We are now ready to prove Theorem 2.

Proof. Let $n$ denote the number of states of $M$. The idea is as follows. For every $q \in Q$, if
$L_q$ is commutative, then there exists $u \in \Sigma^*$ such that $L_q \subseteq u^*$. For any non-empty $w \in L_q$, we thus
have $w \in u^*$. If $z$ is the primitive root of $w$, then $z$ is also the primitive root of $u$. If $L_q \subseteq z^*$,
then $L_q$ is commutative. On the other hand, if $L_q \not\subseteq z^*$, then $L_q$ contains two words with
different primitive roots, and is thus not commutative. This argument leads to the following
algorithm.

For each $q \in Q$ we perform the following steps.

• Construct the NFA $M_q$ accepting $L_q$. This takes $O(n + t)$ time.

• Find a non-empty word $w \in L(M_q)$, where $|w| \le n$. If $L(M_q)$ contains a
non-empty word, such a $w$ exists and can be found in $O(n + t)$ time. Otherwise
$L_q$ contains only the empty word and is trivially commutative.

• Find the primitive root of $w$, i.e., the shortest word $z$ such that $w = z^k$ for
some $k \ge 1$. This can be done in $O(n)$ time using the Knuth-Morris-Pratt
algorithm. To find the primitive root of $w = w_1 \cdots w_\ell$, use Knuth-Morris-Pratt
to find the first occurrence of $w$ in $w_2 \cdots w_\ell w_1 \cdots w_\ell$, viewed as a suffix
of $ww$. If this occurrence begins at position $i$ of $ww$, then $z = w_1 \cdots w_{i-1}$ is
the primitive root of $w$.

• Apply the cross product construction to obtain an NFA $M'$ that accepts
$L_q \setminus z^*$. The NFA $M'$ has $O(n^2)$ states and $O(nt)$ transitions.

• Test whether $L(M')$ is empty or not. If $L(M')$ is non-empty, then by
Lemma 3 the growth of $L(M)$ is exponential. If $L(M')$ is empty, then
$L_q$ is commutative. This step takes $O(n^2 + nt)$ time.

If for all $q \in Q$ we have verified that $L_q$ is commutative, then by Lemma 4 $L(M)$
has polynomial growth.

The runtime of this algorithm is $O(n^3 + n^2 t)$. □
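
The steps of this algorithm can be sketched as follows. This is a minimal illustration under several assumptions of ours: the NFA is given as a dictionary mapping each state to a list of (letter, state) pairs, every state is accessible and co-accessible, Python's `str.find` stands in for Knuth-Morris-Pratt, and the product with an automaton for the complement of $z^*$ is explored on the fly rather than built explicitly as $M'$. All function names are hypothetical.

```python
from collections import deque


def primitive_root(w):
    """Shortest z with w = z^k, for non-empty w."""
    # First occurrence of w in ww at an offset >= 1 (str.find stands in for KMP).
    return w[: (w + w).find(w, 1)]


def shortest_cycle_word(trans, q):
    """Shortest non-empty word labelling a path q -> q, or None if none exists."""
    seen, queue = set(), deque([(q, "")])
    while queue:
        p, word = queue.popleft()
        for a, r in trans.get(p, ()):
            if r == q:
                return word + a
            if r not in seen:
                seen.add(r)
                queue.append((r, word + a))
    return None


def can_reach(trans, q):
    """Set of states from which q is reachable (q itself included)."""
    rev = {}
    for p, edges in trans.items():
        for _, r in edges:
            rev.setdefault(r, []).append(p)
    seen, stack = {q}, [q]
    while stack:
        r = stack.pop()
        for p in rev.get(r, ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def is_commutative(trans, q):
    """Test L_q ⊆ z*, where z is the primitive root of some non-empty w in L_q."""
    w = shortest_cycle_word(trans, q)
    if w is None:
        return True                  # L_q contains only the empty word
    z = primitive_root(w)
    back = can_reach(trans, q)
    # Pairs (state of M, position in z) reachable from (q, 0) along prefixes of zzz...
    seen, stack = {(q, 0)}, [(q, 0)]
    while stack:
        p, i = stack.pop()
        for a, r in trans.get(p, ()):
            if a != z[i]:
                if r in back:        # the word leaves z* and can still return to q
                    return False
                continue
            j = (i + 1) % len(z)
            if r == q and j != 0:    # back at q in the middle of a copy of z
                return False
            if (r, j) not in seen:
                seen.add((r, j))
                stack.append((r, j))
    return True


def has_polynomial_growth(trans, states):
    """Assumes every state is accessible and co-accessible."""
    return all(is_commutative(trans, q) for q in states)


a_star_b_star = {0: [("a", 0), ("b", 1)], 1: [("b", 1)]}
all_words = {0: [("a", 0), ("b", 0)]}
print(has_polynomial_growth(a_star_b_star, [0, 1]))  # True
print(has_polynomial_growth(all_words, [0]))         # False
```

The breadth-first search builds candidate words by string concatenation, which is simple but not within the $O(n + t)$ bound stated above; storing parent pointers instead would recover that bound.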

3 Finding the exact order of polynomial growth 

In this section we show that given a DFA accepting a language of polynomial growth, it is 
possible to efficiently determine the exact order of polynomial growth. We give two different 
algorithms: one combinatorial, the other algebraic. 

Szilard et al. [16, Theorem 4] proved a weaker result: namely, that given a regular
language $L$ and an integer $d \ge 0$, it is decidable whether $L$ has $O(m^d)$ growth. However,
even if $L$ is specified by a DFA, their algorithm takes exponential time.

3.1 A combinatorial algorithm 

Theorem 5. Given a DFA $M$ with $n$ states such that $L(M)$ is of polynomial growth, it is
possible to determine the exact order of polynomial growth in $O(n^2)$ time.

Proof. Let $M = (Q, \Sigma, \delta, q_0, F)$. Again we assume that every state of $M$ is both accessible
and co-accessible. Since $L(M)$ is of polynomial growth, by Lemma 3 $M$ has the property
that for every $q \in Q$ there exists $u \in \Sigma^*$ such that $L_q \subseteq u^*$.

Since M is deterministic, for any state q of M, if there exists a non-empty word w that 
takes M from state q back to q, the smallest such word w is unique. There is also a unique 
cycle of states of M associated with such a word w, and all such cycles in M are disjoint. 

We now contract (in the standard graph-theoretical sense) each such cycle to a single 
vertex and mark this vertex as special. If any vertex on a contracted cycle was final, we also 
mark the new special vertex as final. Since all the cycles in M are disjoint, after contracting 
all of them, the transition graph of the automaton M now becomes a directed acyclic graph 
(DAG) $D$. A path in $D$ from the start vertex to a final vertex that visits special vertices
$Q_1, Q_2, \ldots, Q_k$ corresponds to a family of words in $L(M)$ of the form

$$x_1 y_1^* x_2 y_2^* \cdots x_k y_k^* x_{k+1}, \qquad (1)$$

where the $y_i$'s are words labeling the cycles in $M$ corresponding to the $Q_i$'s in $D$. Note that
if a cycle in $M$ is of size $t$, there could be up to $t$ possible choices for the corresponding $y_i$.

There are only finitely many paths in $D$, and only finitely many choices for the $x_i$'s and
$y_i$'s in a decomposition of the form given by (1). It follows that $L(M)$ is a finite union of
languages of the form $x_1 y_1^* x_2 y_2^* \cdots x_k y_k^* x_{k+1}$. We have thus recovered the characterization of
Szilard et al. [16]. It is well-known that any language of this form has $O(m^{k-1})$ growth (see,
for example, [16, Lemma 4]).

Consider a path through $D$ from the start vertex to a final vertex that visits the maximum
number $d$ of special vertices. Then we may conclude that the order of growth of $L(M)$ is
$O(m^{d-1})$. This observation leads to our desired algorithm.

We first identify all the cycles in M and contract them to obtain a DAG D, as previously 
described. It remains to find a path through D from the start vertex to a final vertex that 
visits the largest number of special vertices. The LONGEST PATH problem for general 
graphs is NP-hard; however, in the case of a DAG, it can be solved in linear time by a 
simple dynamic programming algorithm. To obtain our result, we modify this dynamic 
programming algorithm by adjusting our distance metric so that the length of a path is not 
the number of edges on it, but rather the number of special vertices on the path. The most 
computationally intensive part of this algorithm is finding and contracting the cycles in M, 
which can be done in $O(n^2)$ time. □
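
The procedure in this proof can be sketched as follows, assuming a DFA given as a dictionary from (state, letter) to state in which every state is accessible and co-accessible and whose language has polynomial growth. As a substitution of ours, the cycles are found with Tarjan's strongly connected components algorithm: under the stated hypothesis every non-trivial component is a single cycle, so contracting components is the same as contracting cycles. The function name `growth_order` is hypothetical.

```python
from functools import lru_cache


def growth_order(delta, start, finals):
    """Return d, the largest number of cycles met on an accepting path."""
    succ = {start: []}
    for (p, _letter), r in delta.items():
        succ.setdefault(p, []).append(r)
        succ.setdefault(r, [])

    # Tarjan's algorithm: comp[q] names the strongly connected component of q.
    index, low, comp, stack, on_stack = {}, {}, {}, [], set()

    def visit(q):
        index[q] = low[q] = len(index)
        stack.append(q)
        on_stack.add(q)
        for r in succ[q]:
            if r not in index:
                visit(r)
                low[q] = min(low[q], low[r])
            elif r in on_stack:
                low[q] = min(low[q], index[r])
        if low[q] == index[q]:
            while True:
                r = stack.pop()
                on_stack.discard(r)
                comp[r] = q
                if r == q:
                    break

    for q in list(succ):
        if q not in index:
            visit(q)

    # A component is "special" if it contains an edge, i.e. it is a contracted cycle.
    special = {comp[p] for p in succ for r in succ[p] if comp[p] == comp[r]}
    final_comps = {comp[q] for q in finals if q in comp}
    dag = {}
    for p in succ:
        for r in succ[p]:
            if comp[p] != comp[r]:
                dag.setdefault(comp[p], set()).add(comp[r])

    @lru_cache(maxsize=None)
    def best(c):
        """Most special vertices on a path from c to a final vertex, or None."""
        options = [b for b in (best(c2) for c2 in dag.get(c, ())) if b is not None]
        if c in final_comps:
            options.append(0)
        return (c in special) + max(options) if options else None

    return best(comp[start])


abc = {(0, "a"): 0, (0, "b"): 1, (0, "c"): 2, (1, "b"): 1, (1, "c"): 2, (2, "c"): 2}
print(growth_order(abc, 0, {0, 1, 2}))  # 3
```

For the example automaton, which accepts $a^*b^*c^*$, the result $d = 3$ corresponds to growth of order $m^{d-1} = m^2$. The recursive traversal is adequate for small automata only; an iterative version would be needed for large inputs.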

3.2 An algebraic approach 

We now consider an algebraic approach to determining whether the order of growth is polynomial
or exponential, and in the polynomial case, the order of polynomial growth. Let
$M = (Q, \Sigma, \delta, q_0, F)$, where $|Q| = n$, and let $A = A(M) = (a_{ij})_{1 \le i,j \le n}$ be the adjacency
matrix of $M$, that is, $a_{ij}$ denotes the number of paths of length 1 from $q_i$ to $q_j$. Then $(A^m)_{ij}$
counts the number of paths of length $m$ from $q_i$ to $q_j$. Since a final state is reachable from
every state $q_j$, the order of growth of $L(M)$ is the order of growth of $A^m$ as $m \to \infty$. This
order of growth can be estimated using nonnegative matrix theory.
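
As a small illustration of how the entries of $A^m$ count paths, the sketch below computes matrix powers by repeated multiplication. The helper names and the two example matrices (adjacency matrices of a two-state DFA for $a^*b^*$ and a one-state DFA for $(a+b)^*$) are our own illustrative choices.

```python
def mat_mul(X, Y):
    """Product of two square integer matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(A, m):
    """A^m by repeated multiplication (A^0 is the identity)."""
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(m):
        result = mat_mul(result, A)
    return result


A_poly = [[1, 1],
          [0, 1]]        # a*b*: loops on a and b, one edge from q_1 to q_2
A_exp = [[2]]            # (a+b)*: two loops on a single state

print(mat_pow(A_poly, 5))  # [[1, 5], [0, 1]]
print(mat_pow(A_exp, 5))   # [[32]]
```

The largest entry of `A_poly` raised to the power $m$ is $m$, which grows linearly, whereas the single entry of `A_exp` raised to the power $m$ is $2^m$.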

Theorem 6 (Perron-Frobenius). Let $A$ be a nonnegative square matrix, and let $r$ be the
spectral radius of $A$, i.e., $r = \max\{|\lambda| : \lambda \text{ is an eigenvalue of } A\}$. Then

1. $r$ is an eigenvalue of $A$;

2. there exists a positive integer $h$ such that any eigenvalue $\lambda$ of $A$ with $|\lambda| = r$ satisfies
$\lambda^h = r^h$.

For more details, see [13, Chapters 1, 3].

Definition 1. The number $r = r(A)$ described in the above theorem is called the Perron-Frobenius
eigenvalue of $A$. The dominating Jordan block of $A$ is the largest block in the
Jordan decomposition of $A$ associated with $r(A)$.

Lemma 7. Let $A$ be a nonnegative $n \times n$ matrix over the integers. Then either $r(A) = 0$ or
$r(A) \ge 1$.

Proof. Let $r(A) = r, \lambda_1, \ldots, \lambda_\ell$ be the distinct eigenvalues of $A$, and suppose that $r < 1$.
Then $\lim_{m \to \infty} r^m = \lim_{m \to \infty} \lambda_i^m = 0$ for all $i = 1, \ldots, \ell$, and so $\lim_{m \to \infty} A^m = 0$ (the zero
matrix). But $A^m$ is an integral matrix for all $m \in \mathbb{N}$, and the above limit can hold if and
only if $A$ is nilpotent, i.e., $r = \lambda_i = 0$ for all $i = 1, \ldots, \ell$. □

Lemma 8. Let $A$ be a nonnegative $n \times n$ matrix over the integers. Let $r(A) = r, \lambda_1, \ldots, \lambda_\ell$
be the distinct eigenvalues of $A$, and let $d$ be the size of the dominating Jordan block of $A$.
Then $A^m \in \Theta(r^m m^{d-1})$.

Proof. The theorem trivially holds for $r = 0$. Assume $r \ge 1$. Without loss of generality, we
can assume that $A$ does not have an eigenvalue $\lambda$ such that $\lambda \ne r$ and $|\lambda| = r$; if such an
eigenvalue exists, replace $A$ by $A^h$ (see Theorem 6). Let $J$ be the Jordan canonical form of
$A$, i.e., $A = SJS^{-1}$, where $S$ is a nonsingular matrix, and $J$ is a diagonal block matrix of
Jordan blocks. We use the following notation: $J_{\lambda,e}$ is a Jordan block of order $e$ corresponding
to eigenvalue $\lambda$, and $O_x$ is a square matrix, where all entries are zero, except for $x$ at the
top-right corner. Let $J_{r,d}$ be the dominating Jordan block of $A$. It can be verified by induction
that

$$J_{r,d}^m = \begin{pmatrix} r^m & \binom{m}{1} r^{m-1} & \cdots & \binom{m}{d-1} r^{m-d+1} \\ 0 & r^m & \cdots & \binom{m}{d-2} r^{m-d+2} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & r^m \end{pmatrix}.$$

After division by $r^m m^{d-1}$, all Jordan blocks other than the dominating block converge to zero blocks, and

$$\lim_{m \to \infty} \frac{J_{r,d}^m}{r^m m^{d-1}} = O_\alpha, \qquad \alpha = \frac{1}{(d-1)! \, r^{d-1}}.$$

The result follows. □



Note: The growth order of $A^m$ supplies an algebraic proof of the fact that regular
languages can grow either polynomially or exponentially, but no intermediate growth order
is possible. This result can also be derived from a more general matrix theoretic result of
Bell [1].

Lemma 8 implies that to determine the order of growth of $L(M)$, we need to compute
the Perron-Frobenius eigenvalue $r$ of $A(M)$: if $r = 0$, then $L(M)$ is finite; if $r = 1$, the order
of growth is polynomial; if $r > 1$, the order of growth is exponential. In the polynomial
case, if we want to determine the order of polynomial growth, we need to also compute the
size of the dominating Jordan block, which is the algebraic multiplicity of $r$ in the minimal
polynomial of $A(M)$.

Both computations can be done in polynomial time, though the runtime is more than
cubic. The characteristic polynomial, $c_A(x)$, can be computed in $\tilde{O}(n^4 \log \|A\|)$ bit operations
(here $\tilde{O}$ stands for soft-O, and $\|A\|$ stands for the norm of $A$). If $c_A(x) = x^n$ then $r = 0$;
else, if $c_A(1) \ne 0$, then $r > 1$. In the case of $c_A(1) = 0$, we need to check whether $c_A(x)$
has a real root in the open interval $(1, \infty)$. This can be done using a real root isolation
algorithm; it seems the best deterministic one uses $O(n^6 \log^2 \|A\|)$ bit operations [3]. The
minimal polynomial, $m_A(x)$, can be computed through the rational canonical form of $A$
in $O(n^5 \log \|A\|)$ bit operations (see references in [?]). All algorithms mentioned above are
deterministic; both $c_A(x)$ and $m_A(x)$ can be computed in $O(n^{2.697263} \log \|A\|)$ bit operations
using a randomized Monte Carlo algorithm [10].
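
The first two tests on $c_A(x)$ can be illustrated with exact integer arithmetic. The sketch below is not one of the algorithms cited above: it computes $c_A(x)$ with the simple Faddeev-LeVerrier recurrence, which we substitute here for brevity, and it leaves the real root isolation step undecided. The names `char_poly` and `classify` and the returned labels are hypothetical.

```python
def mat_mul(X, Y):
    """Product of two square integer matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly(A):
    """Coefficients [c_0, ..., c_n] of det(xI - A), by Faddeev-LeVerrier."""
    n = len(A)
    coeffs = [0] * n + [1]
    M = [[0] * n for _ in range(n)]
    for k in range(1, n + 1):
        # M_k = A M_{k-1} + c_{n-k+1} I
        AM = mat_mul(A, M)
        M = [[AM[i][j] + (coeffs[n - k + 1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        # c_{n-k} = -trace(A M_k) / k; the division is exact for integer matrices
        trace = sum(mat_mul(A, M)[i][i] for i in range(n))
        coeffs[n - k] = -trace // k
    return coeffs


def classify(A):
    """Apply the tests c_A(x) = x^n and c_A(1) != 0 described in the text."""
    c = char_poly(A)
    if all(ci == 0 for ci in c[:-1]):
        return "r = 0"
    if sum(c) != 0:                  # sum of coefficients equals c_A(1)
        return "r > 1"
    return "undecided"               # c_A(1) = 0: a root in (1, oo) must be ruled out


print(char_poly([[1, 1], [0, 1]]))   # [1, -2, 1]
print(classify([[0, 1], [0, 0]]))    # r = 0
print(classify([[2]]))               # r > 1
print(classify([[1, 1], [0, 1]]))    # undecided
```

In the last case $c_A(x) = (x - 1)^2$ has no root larger than 1, so $r = 1$; confirming this in general is the job of the root isolation algorithm.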

An interesting problem is the following: given a nonnegative integer matrix $A$, is it possible
to decide whether $r(A) > 1$ in time better than $O(n^6 \log^2 \|A\|)$? Using our combinatorial
algorithm, we can do it in time $O(n^4 \|A\|)$, by interpreting $A$ as the adjacency matrix of
a DFA over a sufficiently large alphabet and applying the algorithm to each of the connected
components of $A$ separately. It would be interesting to find an algorithm polynomial in
$\log \|A\|$.